Detecting changes in Hilbert space data
based on "repeated" and change-aligned
principal components

L. Torgovitski* 


Abstract 

We study a CUSUM (cumulative sums) procedure for the detection of changes 
in the means of weakly dependent time series within an abstract Hilbert space 
framework. We use an empirical projection approach via a principal component 
representation of the data, i.e., we work with the eigenelements of the (long run) 
covariance operator. 

This article contributes to the existing theory in two directions: By means 
of a recent result of Reimherr (2015) we show, on one hand, that the commonly 
assumed separation of the leading eigenvalues for CUSUM procedures can be 
avoided. This assumption is not a consequence of the methodology but merely 
a consequence of the usual proof techniques. On the other hand, we propose to
consider change-aligned principal components that allow us to further reduce common
assumptions on the eigenstructure under the alternative. This approach extends
directly to multidirectional changes, i.e. changes that occur at different time points 
and in different directions, by fusing sufficient information on them into the first 
component. The latter findings are illustrated by a few simulations and compared 
with existing procedures in a functional data framework. 
1. Introduction


Over the last ten years, the analysis of change point problems in a functional data
framework has become a popular area of research. The incorporation of projections
on functional principal components, i.e. eigenfunctions of the (long run) covariance
operator, is by now a well-established technique for change point procedures, and
there is a considerable amount of literature on this topic such as, e.g., Berkes et al.
(2009), Hörmann and Kokoszka (2010), Aston and Kirch (2012), Horváth et al. (2014),
Hörmann et al. (2015) and the book of Horváth and Kokoszka (2012), just to name a
few.

As pointed out in Reimherr (2015), most literature uses the assumption of
well-separated leading $d$ eigenvalues

$$\lambda_1 > \lambda_2 > \dots > \lambda_d > \lambda_{d+1} \qquad (1.1)$$

to ensure suitable estimation of principal components up to signs. In the latter article
it is shown in a rather broad generality that for the estimation of subspaces spanned
by the $d$ leading principal components only the gap $\lambda_d > \lambda_{d+1}$ is relevant, if these
subspaces are estimated as a whole and not in each direction separately. We use the
work of Reimherr (2015) to show how the assumption on distinct eigenvalues can be
omitted in a common change point framework. The idea is to consider one projection
of functional partial sums on a $d$-dimensional subspace rather than $d$ partial sums of
one-dimensional projections.

Another contribution is the study of change corrected principal components. Loosely
speaking, the idea is to correct the leading empirical principal component with an
estimate of the change direction such that, on one hand, the component is asymptotically
oriented in the change direction under the alternative but, on the other hand, still
estimates the true principal component under the null hypothesis. Note that the change
direction represents the most informative subspace in our change point setting. A
remarkable feature of this alignment is that it allows us to test for more complex
multidirectional changes in exactly the same fashion and under the same set of assumptions.
This approach can be related to the study of one-dimensional projections of Aston and
Kirch (2014) under high-dimensionality and also to corrections of covariance estimates
which are common practice in multivariate change point analysis (cf., e.g., Remark 3
of Antoch et al. (1997)).

We assume that $H$ is a real and separable infinite dimensional Hilbert space. The
inner product is denoted by $\langle x,y\rangle_H$ and the corresponding norm by $\|x\|_H$ for $x,y\in H$.
There should be no confusion if we simply write $\langle x,y\rangle$ and $\|x\|$ without the subscript
$H$. The mean $EX$ of a Hilbert space random variable $X$ is defined generally as the
element that fulfills $E\langle X,y\rangle = \langle EX,y\rangle$ for all $y\in H$ given that $E\|X\| < \infty$. Note that
we show our results in an abstract setting having the space $L^2[0,1]$ of square-integrable
functions in mind.

The structure of this article is as follows. We begin with a description of the model and
of the hypothesis tests in Section 2. In Section 3 we continue to explain the principal
components approach and then state our theoretical results on the CUSUM test based
on repeated eigenvalues in Section 4. The change adjusted principal components are
introduced and studied in Section 5 and afterwards extended to more complex
multiple-direction alternatives in Section 6. The analysis is complemented by a small simulation
study in Section 7. All proofs are postponed to Section 8.


2. Model and testing problem 

Let $\{\eta_i\}_{i\in\mathbb{Z}}$ denote the observable Hilbert space valued time series in a signal plus noise
model

$$\eta_i = m_i + \varepsilon_i$$

with means $m_i = E\eta_i$ and a centered, strictly stationary Hilbert space noise $\varepsilon_i$ with
$E\|\varepsilon_1\|^2 < \infty$. Some weak dependence condition will be imposed further below. We
assume that $m_i = g(i/n)\Delta$, $i = 1,\dots,n$, where $\Delta\in H$, $\|\Delta\| = 1$, is the normalized
direction of the means and $g$ is a function that describes their magnitude. We assume
$g$ to be piecewise Lipschitz-continuous and our aim is to test the stability of the means
$m_1,\dots,m_n$, i.e., the null hypothesis

$H_0$: $g$ is a constant function on $[0,1]$

against a trend under the alternative specified by

$H_A$: $g$ is a non-constant function on $[0,1]$.

We have the classical abrupt or epidemic settings in mind but consider this broad 
alternative to underpin the generality of our results. Note that this model has been 
considered, e.g., by Horvath et al. (2014) in a related functional framework. They focused 
more on other alternatives but also state or indicate some results in our situation. 

To test $H_0$ versus $H_A$ we will use a CUSUM procedure relying on dimension
reduction where the basis of the reduced subspaces will be generated by generalized
Hilbert space principal components. Those have been suggested in a functional
two-sample context by Horváth et al. (2013) and are motivated by the optimality properties
of principal components and of their empirical counterparts. To state our results and
our interpretation of the projection based CUSUM we need to introduce the generalized
principal components first.


3. Generalized principal components 

The generalized principal components are eigenelements of the long run covariance
operator

$$\mathcal{C} = \sum_{h\in\mathbb{Z}} \mathcal{C}_h, \qquad \mathcal{C}_h = E[\varepsilon_0\otimes\varepsilon_h],$$

where the tensor operator $x\otimes y$ is defined via $(x\otimes y)z = \langle y,z\rangle x$ for any $x,y,z\in H$
and thus is linear in all components. Let $\|\cdot\|_S$ denote the Hilbert-Schmidt norm. In
the following we tacitly assume the summability of the lagged covariance operators

$$\sum_{h\in\mathbb{Z}} \|\mathcal{C}_h\|_S < \infty \qquad (3.1)$$

which ensures that $\mathcal{C}$ is a well-defined positive self-adjoint Hilbert-Schmidt operator.
Let $\hat{\mathcal{C}}$ be for the moment a generic estimate of $\mathcal{C}$ and assume that $\hat{\mathcal{C}}$ is a self-adjoint
Hilbert-Schmidt operator, too. In this case we may represent both operators using the
spectral decomposition

$$\mathcal{C} = \sum_{j\in\mathbb{N}} \lambda_j (v_j\otimes v_j), \qquad \hat{\mathcal{C}} = \sum_{j\in\mathbb{N}} \hat\lambda_j (\hat v_j\otimes\hat v_j) \qquad (3.2)$$

with real eigenvalues $\lambda_j$ and $\hat\lambda_j$. Without loss of generality we may assume
$\lambda_1 \ge \lambda_2 \ge \dots \ge 0$ and also that the $v_j$'s form an orthonormal basis of $H$. We make exactly
the same assumption on the eigenelements of $\hat{\mathcal{C}}$ but do not assume that all $\hat\lambda_j$ are
non-negative.

Note that we use the term generalized to distinguish between eigenelements of $\mathcal{C}$
and eigenelements of the covariance operator $\mathcal{C}_0$. The latter will be called standard
principal components.


4. Asymptotic behavior of the CUSUM statistic 


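We want to consider the following projection based CUSUM statistics

$$\mathcal{T}^{(d)} = \max_{1\le k\le n} |S_k(\boldsymbol{\eta})|_{\Sigma}, \qquad \hat{\mathcal{T}}^{(d)} = \max_{1\le k\le n} |S_k(\hat{\boldsymbol{\eta}})|_{\hat\Sigma}, \qquad (4.1)$$

where $\hat{\mathcal{T}}^{(d)}$ is the empirical counterpart of $\mathcal{T}^{(d)}$. The detectors (4.1) are based on
projected time series

$$\boldsymbol{\eta}_i = [\langle\eta_i,v_1\rangle,\dots,\langle\eta_i,v_d\rangle]', \qquad \hat{\boldsymbol{\eta}}_i = [\langle\eta_i,\hat v_1\rangle,\dots,\langle\eta_i,\hat v_d\rangle]' \qquad (4.2)$$

and we write $S_k(\boldsymbol{\eta}) = \sum_{i=1}^{k}(\boldsymbol{\eta}_i - \bar{\boldsymbol{\eta}}_n)/n^{1/2}$ to denote their centered partial sums. Note
that $\mathcal{T}^{(d)}$ is not fully observable in practice, since the eigenelements of $\mathcal{C}$ will rarely
be known. The matrix $\Sigma = \mathrm{diag}(\lambda_1,\dots,\lambda_d)$ is the long run covariance matrix of
the projected noise and $\hat\Sigma = \mathrm{diag}(|\hat\lambda_1|,\dots,|\hat\lambda_d|)$ is its generic estimate. (The diagonality of $\Sigma$ is a
property of the generalized principal components.) Finally, $|x|$ denotes the Euclidean
norm and thus $|x|_\Sigma = |\Sigma^{-1/2}x|$, $x\in\mathbb{R}^d$, is also a norm whenever $\Sigma$ has full rank. To
ensure this regularity we tacitly assume that $\lambda_d > 0$. Later on we will ensure that
$\lim_{n\to\infty} P(\hat\lambda_d > 0) = 1$ holds true under the null hypothesis, too. We do not require
the latter under the alternative, but set $\hat{\mathcal{T}}^{(d)} := \infty$ if $\hat\lambda_j = 0$ for some $1\le j\le d$.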
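
To make the empirical detector in (4.1) concrete, the following is a minimal sketch for observations discretized into vectors of length $p$, so that $H$ is replaced by $\mathbb{R}^p$ with the Euclidean inner product. As an assumption for illustration only, the lag-zero sample covariance plays the role of the generic estimate of the long run covariance operator; the function name `projected_cusum` is hypothetical and not taken from the article.

```python
import numpy as np

def projected_cusum(X, d):
    """Projection based CUSUM statistic for an (n, p) data matrix X.

    Assumption: rows of X are discretized observations and the lag-0
    sample covariance stands in for the generic covariance estimate.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)                      # eta_i - mean(eta)
    cov = Xc.T @ Xc / n                          # lag-0 covariance estimate
    eigval, eigvec = np.linalg.eigh(cov)         # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:d]         # indices of the d leading ones
    lam = np.abs(eigval[order])                  # |lambda_hat_j|
    if np.any(lam == 0.0):
        return np.inf                            # convention stated in the text
    scores = Xc @ eigvec[:, order]               # centered projections, shape (n, d)
    S = np.cumsum(scores, axis=0) / np.sqrt(n)   # partial sums S_k, k = 1..n
    # |S_k|_Sigma = Euclidean norm of Sigma^{-1/2} S_k, maximized over k
    return float(np.max(np.sqrt(np.sum(S**2 / lam, axis=1))))
```

With a mean shift in one coordinate the statistic is expected to be markedly larger than without it, because the shift inflates the centered partial sums in the leading direction.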

It is well-known that a functional central limit theorem for the projected time series
allows to establish weak convergence of $\mathcal{T}^{(d)}$ under the null hypothesis by means
of the continuous mapping theorem. Under short range dependence (such as, e.g., strong
mixing or $L^p$-$m$-approximability) the statistics $\hat{\mathcal{T}}^{(d)}$ and other related detectors
are studied in, e.g., Berkes et al. (2009), Hörmann and Kokoszka (2010), Aston and
Kirch (2012) or Horváth et al. (2014). Under $H_0$ the following limit holds true

$$\mathcal{T}^{(d)} \xrightarrow{\;D\;} \sup_{0\le x\le 1} \Big( \sum_{1\le r\le d} B_r^2(x) \Big)^{1/2}, \qquad (4.3)$$

as $n\to\infty$, where $\{B_r(x), x\in[0,1]\}$ are independent standard Brownian bridges.
Clearly, to establish the same convergence for $\hat{\mathcal{T}}^{(d)}$ we just have to replace the
eigenelements of $\mathcal{C}$ with those of $\hat{\mathcal{C}}$. As pointed out by Reimherr (2015) this is usually done by
considering the estimation of the eigenstructure in each direction separately and not at
once. This separated treatment is the origin of some technical issues and in particular
has the drawback of the separation assumption (1.1) on the eigenvalues.

We show how one may use the results of Reimherr (2015) to get rid of this assumption 
and impose only the following condition on the last eigenvalue. 


Assumption 4.1. It holds that $\lambda_d > \lambda_{d+1}$.


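The key to our results is the following CUSUM representation

$$\mathcal{T}^{(d)} = \max_{1\le k\le n} \|\Psi^{(d)} S_k(\eta)\|, \qquad \hat{\mathcal{T}}^{(d)} = \max_{1\le k\le n} \|\hat\Psi^{(d)} S_k(\eta)\| \qquad (4.4)$$

which is based on truncated operators defined by

$$\Psi^{(d)} = \sum_{j=1}^{d} \lambda_j^{-1/2} (v_j\otimes v_j), \qquad \hat\Psi^{(d)} = \sum_{j=1}^{d} |\hat\lambda_j|^{-1/2} (\hat v_j\otimes\hat v_j), \qquad (4.5)$$

for any $d\in\mathbb{N}$. Note that (4.4) just replaces the long run covariance matrices and the
Euclidean norm in (4.1) with the Hilbert space analogues. The equalities in (4.4) are
valid in view of Parseval's identity that implies

$$\|\Psi^{(d)} S_k(\eta)\|^2 = \sum_{r=1}^{\infty} |\langle \Psi^{(d)} S_k(\eta), v_r\rangle|^2 = \sum_{r=1}^{\infty} \Big| \sum_{j=1}^{d} \lambda_j^{-1/2} \langle S_k(\eta), v_j\rangle \langle v_j, v_r\rangle \Big|^2 = \sum_{j=1}^{d} |\lambda_j^{-1/2} \langle S_k(\eta), v_j\rangle|^2 = |S_k(\boldsymbol{\eta})|_\Sigma^2$$

for all $1\le k\le n$. The next proposition is proven by (slightly) adapting the proofs of
Lemmas 3.1 and 3.2 of Reimherr (2015) where (4.5) is considered with $\lambda_j^{-1}$ instead of
$\lambda_j^{-1/2}$.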

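Proposition 4.2. Given that $\|\hat{\mathcal{C}} - \mathcal{C}\|_S = o_P(1)$ it holds that $\|\hat\Psi^{(d)} - \Psi^{(d)}\|_S = o_P(1)$
as $n\to\infty$.


In the sequel we will use the following bound on the maxima of partial sums

$$\max_{1\le k\le n} \|S_k(\varepsilon)\| = O_P(1), \qquad (4.6)$$

as $n\to\infty$. We will discuss this bound in Remark 4.5, below. First, we state the
asymptotics under the null hypothesis.


Theorem 4.3. Let $\|\hat{\mathcal{C}} - \mathcal{C}\|_S = o_P(1)$ and Assumption 4.1 hold true. Given (4.3) and
(4.6), it holds under $H_0$ that, as $n\to\infty$,

$$\hat{\mathcal{T}}^{(d)} \xrightarrow{\;D\;} \sup_{0\le x\le 1} \Big( \sum_{1\le r\le d} B_r^2(x) \Big)^{1/2}. \qquad (4.7)$$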
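A widely used approach to estimate $\mathcal{C}$ is via Bartlett-type estimates which are
defined by

$$\hat{\mathcal{C}}_B = \sum_{r=-n}^{n} \mathcal{K}(r/h)\, \hat{\mathcal{C}}_r$$

with $\hat{\mathcal{C}}_r = \sum_{i=1}^{n-r} [\eta_i - \bar\eta_n]\otimes[\eta_{i+r} - \bar\eta_n]/n$ for $r\ge 0$ and symmetrically with
$\hat{\mathcal{C}}_r = \sum_{i=1}^{n-|r|} [\eta_{i-r} - \bar\eta_n]\otimes[\eta_i - \bar\eta_n]/n$ for $r<0$. The bandwidth $h = h_n\in\mathbb{N}$ fulfills $h\to\infty$ as
$n\to\infty$ but $h = o(n)$. The kernel $\mathcal{K}$ is symmetric, i.e. $\mathcal{K}(x) = \mathcal{K}(-x)$ for $x\in\mathbb{R}$, and
piecewise continuous with $\mathcal{K}(0) = 1$. Moreover, it is continuous at $x = 0$, bounded by
$|\mathcal{K}(x)| \le c$ for some constant $c>0$ and has a bounded support such that $\mathcal{K}(x) = 0$
for $|x| > a$ for some $a>0$.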
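
The Bartlett-type estimate described above can be sketched for data discretized into vectors of length $p$, where the tensor $x\otimes y$ becomes the matrix $xy'$. This is an illustrative sketch under that discretization assumption; the function names are hypothetical, and both the triangular and the flat-top kernel are merely examples of kernels that are symmetric, bounded, equal to one at zero and compactly supported.

```python
import numpy as np

def bartlett_kernel(x):
    """Triangular kernel: symmetric, K(0) = 1, support [-1, 1]."""
    return max(0.0, 1.0 - abs(x))

def flat_top_kernel(x):
    """Flat-top kernel: indicator of |x| <= 1."""
    return 1.0 if abs(x) <= 1.0 else 0.0

def bartlett_lrcov(X, h, kernel=bartlett_kernel):
    """Kernel-weighted sum of lagged sample covariances of an (n, p) array X."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    C = np.zeros((p, p))
    for r in range(n):
        w = kernel(r / h)
        if w == 0.0:
            continue                       # lag lies outside the kernel support
        # lag-r covariance: sum_i x_i x_{i+r}' / n  (x (tensor) y  <->  x y')
        Cr = Xc[: n - r].T @ Xc[r:] / n
        # the lag -r term is the transpose of the lag r term
        C += w * Cr if r == 0 else w * (Cr + Cr.T)
    return C
```

For $h = 1$ and the triangular kernel only the lag-zero term survives, so the estimate reduces to the ordinary sample covariance matrix, which gives a quick sanity check.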

To be more specific about the conditions of Theorem 4.3 and some of the subsequent 
theorems we consider the following weak dependence concept. 

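Definition 4.4. The time series $\{\varepsilon_i\}_{i\in\mathbb{Z}}$ is $L^p$-$m$-approximable if the following
conditions are all satisfied: 1) It holds that $E\|\varepsilon_0\|^p < \infty$, for some $p\ge 1$. 2) It holds that
$\varepsilon_i = f(\zeta_i, \zeta_{i-1}, \zeta_{i-2}, \dots)$ where $f: S^\infty \to H$ is a measurable mapping from some
measurable space $S$ and where the innovations $\zeta_i$ are i.i.d. $S$-valued random elements. 3) It
holds that $\sum_{m=1}^{\infty} [E\|\varepsilon_0 - \varepsilon_0^{(m)}\|^p]^{1/p} < \infty$ where $\varepsilon_i^{(m)}$ are $m$-dependent copies of $\varepsilon_i$
defined by $\varepsilon_i^{(m)} = f(\zeta_i, \dots, \zeta_{i-m+1}, \zeta^{(i)}_{i-m}, \zeta^{(i)}_{i-(m+1)}, \dots)$, using a family $\{\zeta_r, \zeta_r^{(i)},\ i, r\in\mathbb{Z}\}$
of i.i.d. random variables.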

Remark 4.5. In Theorem 4.3 we have to ensure that (3.1), (4.3), $\|\hat{\mathcal{C}} - \mathcal{C}\|_S = o_P(1)$ and
that (4.6) hold true. Let $\{\varepsilon_i\}_{i\in\mathbb{Z}}$ be $L^p$-$m$-approximable. The summability assumption
(3.1) is shown, e.g., in Hörmann et al. (2015) and (4.3) follows from Theorem A.1 of Aue
et al. (2009). Using the above Bartlett-type estimates the $\|\hat{\mathcal{C}}_B - \mathcal{C}\|_S = o_P(1)$ bound
follows similarly to Horváth et al. (2013) and rates can be obtained, e.g., via Horváth
et al. (2014), Berkes et al. (2015) or Hörmann et al. (2015) under $L^p$-$m$-approximability
with (partly) some additional but mild assumptions. We refer also to Panaretos and
Tavakoli (2013) for another broad setting. The bound (4.6) follows from Jirak (2013)
or from Berkes et al. (2013).

Note that the conditions $\|\hat{\mathcal{C}} - \mathcal{C}\|_S = o_P(1)$ and $\max_{1\le k\le n} \|S_k(\varepsilon)\| = O_P(1)$ are
directly related to those mentioned in Remark 3.1 of Aston and Kirch (2012) but here
on the operator and on the Hilbert space level.

Next, we state the asymptotics under the alternative. 

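Proposition 4.6. Assume that we use estimates $\hat{\mathcal{C}}$ such that $|\langle\hat v_k,\Delta\rangle| = c + o_P(1)$
holds true for some $c>0$ and that $\hat\lambda_k = o_P(n)$ is fulfilled for some $1\le k\le d$. Given
(4.6), it holds under $H_A$ that

$$\lim_{n\to\infty} P(\hat{\mathcal{T}}^{(d)} > t) = 1 \qquad (4.8)$$

for any threshold $t\in\mathbb{R}$.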

The above assumptions can be ensured using Bartlett-type estimates, e.g., under the
concept of $L^p$-$m$-approximability. The next proposition is related to the statement
in (3.14) of Horváth et al. (2014). As already mentioned, the trend-alternative case
is indicated in the latter article, too. Note that no additional assumptions on the
eigenvalues are necessary.

Proposition 4.7. Assume that $\{\varepsilon_i\}_{i\in\mathbb{Z}}$ are $L^p$-$m$-approximable and that we use
estimates $\hat{\mathcal{C}}_B$ with $\int \mathcal{K}(x)\,dx \ne 0$. Then, (4.8) holds true under $H_A$ for any $d\in\mathbb{N}$.

In the i.i.d. case it appears more natural to use the estimate $\hat{\mathcal{C}}_0$ for $\mathcal{C} = \mathcal{C}_0$ rather
than $\hat{\mathcal{C}}_B$. However, a well-known issue is that not all changes can be detected then.
In particular the condition $|\langle\hat v_k,\Delta\rangle| = c + o_P(1)$, $c>0$, for some $1\le k\le d$, needs to
be imposed and is quite difficult to verify. For example it is discussed in Berkes et al.
(2009) and Aston and Kirch (2012) that this condition is fulfilled for changes $\Delta$ that
are not orthogonal to all $v_1,\dots,v_d$ and also for sufficiently large trends.

One remedy is to use the estimate $\hat{\mathcal{C}}_B$ instead of $\hat{\mathcal{C}}_0$ which is discussed in Horváth
et al. (2014). It forces the leading eigenvector $\hat v_1$ to align with the change direction. We
discuss another more direct approach in the next section that relies on a correction
of the first eigenvector. This correction is not restricted to $\hat{\mathcal{C}}_0$ and can be applied to
eigenvectors of $\hat{\mathcal{C}}_B$ too.


5. Change-aligned principal components

We define the aligned first principal component by

$$\hat v_1' = [\hat v_1/n^{\gamma} + s u]\,/\,\|\hat v_1/n^{\gamma} + s u\| \qquad (5.1)$$

for some $\gamma\in(0,1/2)$, where $u = S_{\hat k}(\eta)/n^{1/2}$, $s = \mathrm{sign}\langle\hat v_1,u\rangle$, and where $\hat k$ is chosen
such that $\|S_{\hat k}(\eta)\| = \max_{1\le k\le n} \|S_k(\eta)\|$. Under the null hypothesis and given (4.6) it
behaves asymptotically like the empirical principal component $\hat v_1$ whereas under the
alternative it aligns with the change direction $\Delta$. The speed of the alignment is
controlled via the parameter $\gamma$. Clearly, a larger $\gamma$ slows down the convergence under
the null but speeds it up under the alternative. Using $\hat v_1'$ we define the adjusted statistic

$$\hat{\mathcal{T}}^{(d)\prime} = \max_{1\le k\le n} |S_k(\hat{\boldsymbol{\eta}}')|_{\hat\Sigma}$$

where $\hat{\boldsymbol{\eta}}_i' = [\langle\eta_i,\hat v_1'\rangle, \langle\eta_i,\hat v_2\rangle, \dots, \langle\eta_i,\hat v_d\rangle]'$, i.e. with $\hat v_1$ being replaced by $\hat v_1'$ but leaving all
other components and in particular all the eigenvalue estimates unchanged. Note that
representation (4.4) is not valid anymore for $\hat{\mathcal{T}}^{(d)\prime}$ since the orthogonality is generally
violated but this is not an issue in our context.
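
The alignment step (5.1) is easy to sketch for observations discretized into vectors of length $p$. The snippet below is an illustration under that assumption, with a hypothetical function name; the tie-breaking rule for a zero inner product (sign taken as $+1$) is also an assumption that the definition above leaves open.

```python
import numpy as np

def aligned_first_component(X, v1, gamma=0.4):
    """Change-aligned version of a unit vector v1 for an (n, p) data matrix X."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    S = np.cumsum(Xc, axis=0) / np.sqrt(n)           # partial sums S_k(eta)
    k_hat = int(np.argmax(np.linalg.norm(S, axis=1)))
    u = S[k_hat] / np.sqrt(n)                        # estimated change direction
    s = 1.0 if float(v1 @ u) >= 0.0 else -1.0        # assumed tie-break: +1
    w = v1 / n**gamma + s * u
    return w / np.linalg.norm(w)
```

In a toy example where the change occurs along a coordinate that is orthogonal to `v1`, the returned vector is expected to load mostly on the change coordinate, since `u` stays of constant order while `v1 / n**gamma` shrinks.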

Corollary 5.1. Under the assumptions of Theorem 4.3 the asymptotics (4.7) hold true
for $\hat{\mathcal{T}}^{(d)\prime}$ under $H_0$.

Theorem 5.2. Assume that we use estimates $\hat{\mathcal{C}}$ such that $\hat\lambda_1 = o_P(n)$ holds true as
$n\to\infty$. Given (4.6), it holds under $H_A$ that

$$\lim_{n\to\infty} P(\hat{\mathcal{T}}^{(d)\prime} > t) = 1$$

for any threshold $t\in\mathbb{R}$.

The assumptions of Theorem 5.2 are fulfilled in the i.i.d. case if we work with
$\hat{\mathcal{C}} = \hat{\mathcal{C}}_0$ (cf., e.g., Berkes et al. (2009)). Under $L^p$-$m$-approximability, they follow from
the proof of Proposition 4.7 if we use $\hat{\mathcal{C}}_B$ with $\int \mathcal{K}(x)\,dx \ne 0$.


6. Extension to multiple change directions 

Another interesting advantage of the change-aligned principal components is that we 
may test far more complex alternatives within the same set of assumptions as in 
Theorem 5.2. Consider the model 

$$m_i = g_1(i/n)\Delta_1 + \dots + g_q(i/n)\Delta_q \qquad (6.1)$$

where $\Delta_1,\dots,\Delta_q\in H$, $q\in\mathbb{N}$, are orthonormal directions and the $g_i$ are piecewise Lipschitz
continuous functions. The general test is now $H_0$: all $g_i$ are constant on $[0,1]$ versus
$H_A$: at least one $g_i$ is non-constant on $[0,1]$. Proceeding as under the one direction
framework one may restate Theorem 5.2. Note that the proof essentially relies on the
convergence $\|u\| \to \mathcal{G} > 0$ in probability, as $n\to\infty$, where $u$ is defined in (5.1) and where in this
general case

$$\mathcal{G}^2 := \sup_{0\le x\le 1} \|G_{g_1}(x)\Delta_1 + \dots + G_{g_q}(x)\Delta_q\|^2 = \sup_{0\le x\le 1} \big( |G_{g_1}(x)|^2 + \dots + |G_{g_q}(x)|^2 \big).$$

Here, $G_{g_i}$ denote $G$ as defined in (8.2), but with respect to trends $g_i$. The equality
holds due to Parseval's identity and due to the orthonormality of the change directions.
Furthermore, using Bartlett-type estimates we can repeat (with some notational
effort) Lemma 8.2 and Theorem 8.3, e.g., under $L^p$-$m$-approximability, to show that
all assumptions of the analogue of Theorem 5.2 are fulfilled, too.

Loosely speaking, the information on all changes is condensed into one direction. 
This result may even be of some interest in a multivariate framework. 


7. Simulations 

In this section we illustrate our findings and compare the test based on aligned principal
components with the principal component approach of Berkes et al. (2009) and of
Horváth et al. (2014) in a functional framework with $H = L^2[0,1]$. For the simulations
we use R. We consider independent square-integrable functional observations generated
from sample paths of standard Brownian motions on $[0,1]$. Those are computed using
the e1071-package and then converted into functional objects with the fda-package using
a Fourier basis of 25 basis functions which results in a rather smooth representation.
For our simulations under the alternative we use piecewise linear trend functions

$$g_{[\theta_1,\theta_2]}(x) = (x-\theta_1)(\theta_2-\theta_1)^{-1}\,\mathbb{1}_{[\theta_1,\theta_2)}(x) + \mathbb{1}_{[\theta_2,1]}(x), \qquad 0 < \theta_1 \le \theta_2 < 1.$$

The particular trend is not important for our discussion and epidemic or other
alternatives could be chosen as well. First, we show the effects of the change-alignment by using
a sample of $n = 200$. We consider an abrupt change setting with $g \propto g_{[1/2,1/2]}$
and the change direction $\Delta = (v_{10} + v_{11} + v_{12})/3^{1/2}$. Some of these functional
observations are illustrated in Figure 1 which shows them before and after the change. The
change direction is chosen with a sole purpose: it is not aligned with any of the leading
true population principal components of the data.

Figure 1: Functional observations with and without a functional change.
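
A data set of this kind can be imitated on a grid, which may help when experimenting with the procedure. The sketch below is not the R pipeline used in the article (e1071 and fda with a Fourier basis); it is a swapped-in NumPy approximation, with hypothetical function names, that builds Brownian motion paths from cumulated Gaussian increments and uses the known Karhunen-Loève eigenfunctions $\sqrt{2}\sin((j-1/2)\pi t)$ of standard Brownian motion as population principal components. The shift size is a free parameter chosen for illustration.

```python
import numpy as np

def bm_eigenfunction(j, t):
    """j-th Karhunen-Loeve eigenfunction of standard Brownian motion on [0, 1]."""
    return np.sqrt(2.0) * np.sin((j - 0.5) * np.pi * t)

def simulate_sample(n=200, p=200, shift=1.0, change_at=0.5, seed=0):
    """n approximate Brownian paths on a p-point grid with an abrupt mean change."""
    rng = np.random.default_rng(seed)
    t = (np.arange(p) + 0.5) / p                       # midpoint grid on [0, 1]
    increments = rng.standard_normal((n, p)) / np.sqrt(p)
    X = np.cumsum(increments, axis=1)                  # grid approximation of the paths
    # change direction (v_10 + v_11 + v_12) / sqrt(3), normalized in L^2[0, 1]
    delta = sum(bm_eigenfunction(j, t) for j in (10, 11, 12)) / np.sqrt(3.0)
    k = int(np.floor(change_at * n))
    X[k:] += shift * delta                             # abrupt change after observation k
    return t, X, delta
```

On the midpoint grid the squared $L^2$ norm of a function is approximated by the mean of its squared grid values, so `np.mean(delta**2)` should be very close to one.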

The same data is used for Figure 2 which shows the four leading empirical standard
(or generalized) principal components $\hat v_j$ and $\hat v_{j,B}$ and the corresponding change-aligned
first components $\hat v_1'$ and $\hat v_{1,B}'$. Those are computed either with $\hat{\mathcal{C}}_0$ or with $\hat{\mathcal{C}}_B$, using a flat-top
Bartlett estimate with kernel $\mathcal{K}(x) = \mathbb{1}_{\{|x|\le 1\}}(x)$ and a bandwidth $h$ equal to the integer part of a root of $n$. For
the change-aligned components we choose $\gamma = 2/5$. In Figure 2 we see that the
alignment has only little effect under $H_0$ but a rather strong impact under $H_A$. We
also see that the first standard (and the first generalized) principal components are
not capturing the large but quite oscillating change - even though the first direction
explains about 82% (73%) of the variability in the change-contaminated data. This
is substantially improved for the aligned counterparts (in the first row) which move
both, as expected, in the right direction. Note that the generalized empirical principal
components would also push the change more into the first component and achieve a
comparable effect under a larger sample size or a larger bandwidth.

Next, we compare the performance of the test based on change-aligned components 
with the usual ones, i.e. with the approach considered in, e.g., Berkes et al. (2009) and 
Horvath et al. (2014). The critical values are based on the limiting distribution (4.7) 
and may be obtained, e.g., from Table 2 of Kiefer (1959). The reported rejection rates 
in Table 1 are based on 1000 repetitions. 
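
As an alternative to tabulated values, critical values of the limit in (4.7) can be approximated by Monte Carlo. The following sketch, with hypothetical function names and arbitrary grid and repetition sizes, simulates independent Brownian bridges on a grid and takes an empirical quantile of the supremum; the discretization slightly underestimates the supremum, so the result is only an approximation.

```python
import numpy as np

def simulate_limit(d, n_grid=500, n_rep=2000, seed=0):
    """Draws from sup_x (sum_r B_r(x)^2)^(1/2) with d independent Brownian bridges."""
    rng = np.random.default_rng(seed)
    x = np.arange(1, n_grid + 1) / n_grid
    steps = rng.standard_normal((n_rep, d, n_grid)) / np.sqrt(n_grid)
    W = np.cumsum(steps, axis=2)                  # Brownian motions on the grid
    B = W - x * W[:, :, -1:]                      # bridges: B(x) = W(x) - x W(1)
    return np.sqrt((B**2).sum(axis=1)).max(axis=1)

def critical_value(d, level=0.10, **kwargs):
    """Approximate (1 - level)-quantile of the limit distribution."""
    return float(np.quantile(simulate_limit(d, **kwargs), 1.0 - level))
```

For $d = 1$ the limit is the supremum of the absolute value of a Brownian bridge, i.e. the Kolmogorov distribution, whose 90% quantile is about 1.22; the simulated value should be close to that and should increase with $d$.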

Remark 7.1. Note that the aligned principal components ensure that the first
projection contains asymptotically all information on any change $\Delta$ (and sufficient information
on any changes $\Delta_1,\dots,\Delta_q$). Thus, we propose to consider only this direction for testing
and choose $d = 1$ which has the additional advantage that no selection criterion for $d$ is
necessary. As already mentioned in the introduction, one-dimensional projections are
also considered by Aston and Kirch (2014). The authors indicate in their concluding
section that (random) projections related to principal components might be of some
further interest. Our construction of $\hat v_1'$ can be regarded as an explicit example of such
a projection.
Figure 2: Empirical standard principal components $\hat v_1,\dots,\hat v_4$ under $H_0$ and $H_A$ (1st and
2nd columns) and empirical generalized principal components $\hat v_{1,B},\dots,\hat v_{4,B}$ under $H_0$
and $H_A$ (3rd and 4th columns) in ascending order. The dotted red lines indicate the
true principal components $v_1,\dots,v_4$ and the dotted black lines additionally the change
direction $\Delta$ under the alternative. The first row shows the aligned 1st components $\hat v_1'$
and $\hat v_{1,B}'$.
Within each setting the four columns correspond, in this order, to $\hat v_1$, $\hat v_1'$, $\hat v_{1,B}$ and $\hat v_{1,B}'$.

  n |            A            |            B            |            C
100 |   7.9  10.0   7.3   7.9 |  69.5  70.6  66.0  67.2 |  11.6  96.9  64.7  96.4
200 |   8.5   8.1   8.7   9.1 |  95.1  95.4  94.1  94.4 |  13.6   100  65.7   100
300 |   9.5   9.9   9.4  10.5 |  99.0  99.1  99.0  99.0 |  13.6   100  92.7   100
400 |   9.2   8.3   8.4   8.9 |  99.8  99.8  99.0  99.0 |  14.3   100  95.5   100
500 |   9.9   9.0   9.1   9.4 |   100   100   100   100 |  13.8   100  96.6   100

  n |            D            |            E            |            F
100 |  41.3  44.1  40.6  42.0 |  52.8  57.1  52.9  56.6 |  10.7  36.9  24.9  51.6
200 |  70.0  68.0  64.8  65.9 |  78.7  86.5  80.6  87.4 |   9.5  87.4  24.3  90.1
300 |  85.2  87.3  85.7  86.3 |  91.7  97.3  93.1  97.7 |  13.2   100  41.3   100
400 |  94.3  94.1  93.2  93.5 |  96.6  99.6  98.0  99.6 |   9.6   100  42.4   100
500 |  97.8  98.2  97.3  97.5 |  99.1  99.8  99.6  99.8 |  11.6   100  40.8   100

Table 1: Empirical rejection rates in percent. The critical value is chosen such that
10% are rejected asymptotically under $H_0$.


In Table 1 we consider the following settings where we include $\sin(t)$ and $t$ because
both alternatives were considered in Berkes et al. (2009):

A. $H_0$,
B. $\Delta(t) = c\sin(t)$ and $g \propto g_{[1/2,1/2]}$,
C. $\Delta(t) = v_{10}(t)$ and $g \propto g_{[1/2,1/2]}$,
D. $\Delta(t) = ct$ and $g \propto g_{[1/3,2/3]}$,
E. $\Delta(t) = c\cos(t)$ and $g \propto g_{[1/3,2/3]}$,
F. $\Delta_1(t) = v_{10}(t)$, $\Delta_2(t) = v_{15}(t)$ and $g_1 \propto g_{[3/5,1]}$, $g_2 \propto g_{[1/3,2/3]}$.

The constant $c$ is always chosen such that $\Delta$ is normalized. The scaling of the trend
functions is always chosen such that the increase in the power is visible when the sample
size $n$ increases. We see in Table 1 that the size remains stable under alignment (column
A) and that the power is not affected whenever it is already high without alignment
(columns B, D and E) but may substantially increase if it is not (columns C and F).


Conclusion 

From a theoretical point of view, we reduced the assumptions on the standard CUSUM
procedure in a functional data setting under the null and under the alternative
hypotheses in a flexible change in the mean framework. From a practical point of view, we
proposed change-aligned principal components and demonstrated that they are useful
modifications of the common principal component approach.

8. Proofs


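Proof of Proposition 4.2. First, we define the mixed operator

$$\tilde\Psi^{(d)} = \sum_{j=1}^{d} \lambda_j^{-1/2} (\hat v_j\otimes\hat v_j)$$

and observe by the mean value theorem that $(x^{-1/2} - y^{-1/2})^2 \le (x-y)^2/(\min(x,y))^3$
for $x,y>0$. Hence, slightly modifying the proofs of Lemmas 3.1 and 3.2 of Reimherr
(2015), we get a bound for $\|\tilde\Psi^{(d)} - \Psi^{(d)}\|_S$ in terms of $\|\hat{\mathcal{C}} - \mathcal{C}\|_S$, with factors
that depend only on the $d$-th eigenvalue and on the gap between the $d$-th and the $(d+1)$-th
eigenvalue (population and empirical versions), and a bound for $\|\hat\Psi^{(d)} - \tilde\Psi^{(d)}\|_S$ in
terms of $\|\hat{\mathcal{C}} - \mathcal{C}\|_S$.
Since $\|\hat{\mathcal{C}} - \mathcal{C}\|_S = o_P(1)$ we know that $\hat\lambda_d = \lambda_d + o_P(1)$ and $\hat\lambda_{d+1} = \lambda_{d+1} + o_P(1)$ and
the proof is complete since $\lambda_d > \lambda_{d+1} \ge 0$. □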

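Proof of Theorem 4.3. It holds under the null hypothesis that

$$\Big| \max_{1\le k\le n} |S_k(\hat{\boldsymbol{\eta}})|_{\hat\Sigma} - \max_{1\le k\le n} |S_k(\boldsymbol{\eta})|_{\Sigma} \Big| = \Big| \max_{1\le k\le n} |S_k(\hat{\boldsymbol{\varepsilon}})|_{\hat\Sigma} - \max_{1\le k\le n} |S_k(\boldsymbol{\varepsilon})|_{\Sigma} \Big|$$
$$= \Big| \max_{1\le k\le n} \|\hat\Psi^{(d)} S_k(\varepsilon)\| - \max_{1\le k\le n} \|\Psi^{(d)} S_k(\varepsilon)\| \Big| \le \max_{1\le k\le n} \|(\hat\Psi^{(d)} - \Psi^{(d)}) S_k(\varepsilon)\|$$
$$\le \|\hat\Psi^{(d)} - \Psi^{(d)}\|_{\mathcal{L}} \max_{1\le k\le n} \|S_k(\varepsilon)\| \le c\,\|\hat\Psi^{(d)} - \Psi^{(d)}\|_S \max_{1\le k\le n} \|S_k(\varepsilon)\|,$$

for some $c>0$, where $\|\cdot\|_{\mathcal{L}}$ is the operator norm. The assertion follows on using
Proposition 4.2. □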

To investigate the behavior under the alternative we need a weighted law of large 
numbers. The proof is obvious and thus omitted. 


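Proposition 8.1. Let $\{\varepsilon_i\}_{i\in\mathbb{Z}}$ be an $H$-valued, centered time series and let $\{a_{n,i}\}$
be an array of scalars such that $\max_{1\le i\le n} |a_{n,i}| = O(n^{-1})$. Given $\sum_{i,j=1}^{n} |E\langle\varepsilon_i,\varepsilon_j\rangle| = O(n)$
it holds that $E\|\sum_{i=1}^{n} a_{n,i}\varepsilon_i\|^2 = O(n^{-1})$.

The assumption of Proposition 8.1 on the noise is fulfilled under $L^p$-$m$-approximability
according, e.g., to Hörmann et al. (2015).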

Subsequently, we make use of the following functions

$$\mathcal{W}(g) = \int_0^1 g^2(y)\,dy - \Big( \int_0^1 g(y)\,dy \Big)^2, \qquad (8.1)$$

and

$$G(x) = \int_0^x g(y)\,dy - x \int_0^1 g(y)\,dy. \qquad (8.2)$$

Furthermore, for convenience, we set

$$\beta_{n,i} = g(i/n) - \int_0^1 g(x)\,dx \qquad (8.3)$$

and recall that under $H_A$ the function $g$ is non-constant in which case $\mathcal{W}(g) > 0$ and

$$\mathcal{G} := \sup_{0\le x\le 1} |G(x)| > 0, \qquad (8.4)$$

as well. For the sake of a more compact presentation we restrict ourselves to Lipschitz
continuous trend alternatives. The piecewise case may be treated similarly.
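
The two quantities in (8.1) and (8.4) are simple to evaluate numerically for a given trend, which can be useful for checking that a chosen alternative is indeed detectable. The following sketch uses a midpoint rule with a hypothetical helper name; the grid size is an arbitrary choice and `g` is assumed to accept a NumPy array.

```python
import numpy as np

def trend_functionals(g, m=10_000):
    """Return the variance-type functional (8.1) and sup|G| from (8.2), (8.4)."""
    mid = (np.arange(m) + 0.5) / m          # midpoints of m equal cells
    right = (np.arange(m) + 1.0) / m        # right endpoints x_j = j / m
    gx = g(mid)
    mean_g = gx.mean()                      # approximates the integral of g
    w = (gx**2).mean() - mean_g**2          # (8.1)
    G = np.cumsum(gx) / m - right * mean_g  # (8.2) evaluated at x_j
    return float(w), float(np.max(np.abs(G)))
```

For an abrupt change at $1/2$, i.e. $g(x) = \mathbb{1}\{x > 1/2\}$, both quantities equal $1/4$, while for a constant $g$ both vanish, in line with the statement that they are positive exactly under the alternative.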


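Lemma 8.2. Assume that $g$ is Lipschitz continuous and set the $\beta_{n,i}$ according to (8.3).
It holds that

$$\lim_{n\to\infty} \max_{0\le r\le h} \Big| \sum_{i=1}^{n-r} \beta_{n,i}\beta_{n,i+r}/n - \mathcal{W}(g) \Big| = 0,$$

where $h\to\infty$ with $h = o(n)$, as $n\to\infty$.

Proof of Lemma 8.2. Using the Lipschitz continuity of $g$ we get that

$$\max_{0\le r\le h} \Big| \sum_{i=1}^{n-r} \beta_{n,i}\beta_{n,i+r}/n - \mathcal{W}(g) \Big|$$
$$\le \max_{0\le r\le h} \Big| \sum_{i=1}^{n-r} \beta_{n,i}\beta_{n,i}/n - \mathcal{W}(g) \Big| + \max_{0\le r\le h} \Big( \sum_{i=1}^{n-r} |\beta_{n,i}|\,|\beta_{n,i} - \beta_{n,i+r}|/n \Big)$$
$$\le \Big| \sum_{i=1}^{n} \beta_{n,i}\beta_{n,i}/n - \mathcal{W}(g) \Big| + \max_{0\le r\le h} \Big| \sum_{i=n-r+1}^{n} \beta_{n,i}\beta_{n,i}/n \Big|$$
$$\quad + \Big( \max_{0\le r\le h} \max_{1\le i\le n-r} |\beta_{n,i} - \beta_{n,i+r}| \Big) \Big( \sum_{i=1}^{n} |\beta_{n,i}|/n \Big) = O(h/n) \qquad (8.5)$$

which completes the proof. □

The next theorem is an extension of Lemma B.2 of Horváth et al. (2014).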


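Theorem 8.3. Let all assumptions of Proposition 4.7 hold true and assume that
$\|\hat{\mathcal{C}}_B - \mathcal{C}\|_S = o_P(1)$ under $H_0$. It holds under $H_A$ that

$$\|\hat{\mathcal{C}}_B - \mathcal{S}_n\|_S = o_P(h), \qquad (8.6)$$

as $n\to\infty$, with $\mathcal{S}_n = \mathcal{W}(g) \sum_{r=-n}^{n} \mathcal{K}(r/h)\,(\Delta\otimes\Delta)$.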
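Proof of Theorem 8.3. We sketch the proof only for a Lipschitz continuous $g$. Set the
$\beta_{n,i}$ according to (8.3) and observe that

$$\eta_{i+r} - \bar\eta_n = \varepsilon_{i+r} - \bar\varepsilon_n + \beta_{n,i+r}\Delta + \rho_n\Delta \qquad (8.7)$$

where $\rho_n = o(1)$ is uniform in $i$ and $r$. Let us decompose as follows

$$(\eta_i - \bar\eta_n)\otimes(\eta_{i+r} - \bar\eta_n) = (\varepsilon_i - \bar\varepsilon_n)\otimes(\varepsilon_{i+r} - \bar\varepsilon_n) + \beta_{n,i}\beta_{n,i+r}(\Delta\otimes\Delta) \qquad (8.8)$$
$$\qquad + \beta_{n,i+r}(\varepsilon_i\otimes\Delta) + \beta_{n,i}(\Delta\otimes\varepsilon_{i+r}) + R_n \qquad (8.9)$$

with a remainder $R_n$ which is of order $o_P(h)$ due to $\sum_{i=1}^{n} \beta_{n,i}\varepsilon_i/n = o_P(1)$ and the
latter being a consequence of Proposition 8.1. The proof is finished on showing that
the term $\beta_{n,i}\beta_{n,i+r}(\Delta\otimes\Delta)$ is dominating. It holds that

$$\Big\| \sum_{r=1}^{n} \mathcal{K}(r/h) \sum_{i=1}^{n-r} \beta_{n,i+r}(\varepsilon_i\otimes\Delta)/n \Big\|_S$$
$$\le \lfloor ch\rfloor \max_{1\le r\le \lfloor ch\rfloor} \Big\| \sum_{i=1}^{n-r} \beta_{n,i}\varepsilon_i \Big\|/n + \sum_{r=1}^{\lfloor ch\rfloor} \Big( \sum_{i=1}^{n-r} |\beta_{n,i} - \beta_{n,i+r}|\,\|\varepsilon_i\|/n \Big) \qquad (8.10)$$
$$\le \lfloor ch\rfloor \max_{1\le r\le \lfloor ch\rfloor} \Big\| \sum_{i=1}^{n-r} \beta_{n,i}\varepsilon_i \Big\|/n + \lfloor ch\rfloor \max_{1\le r\le \lfloor ch\rfloor} \Big( \max_{1\le i\le n-r} |\beta_{n,i} - \beta_{n,i+r}| \Big) \Big( \sum_{i=1}^{n} \|\varepsilon_i\|/n \Big) = o_P(h)$$

for some $c>0$. The rate $o_P(h)$ is shown as follows: First, using Proposition 8.1 and
$\sum_{i=n-\lfloor ch\rfloor+1}^{n} |\beta_{n,i}|\,E\|\varepsilon_i\| = O(h)$ we observe that

$$\max_{1\le r\le \lfloor ch\rfloor} \Big\| \sum_{i=1}^{n-r} \beta_{n,i}\varepsilon_i \Big\|/n \le \Big\| \sum_{i=1}^{n} \beta_{n,i}\varepsilon_i \Big\|/n + \sum_{i=n-\lfloor ch\rfloor+1}^{n} |\beta_{n,i}|\,\|\varepsilon_i\|/n = O_P(h/n^{1/2}).$$

Second, by the Lipschitz continuity and the law of large numbers we get

$$\max_{1\le r\le \lfloor ch\rfloor} \Big( \max_{1\le i\le n-r} |\beta_{n,i} - \beta_{n,i+r}| \Big) \Big( \sum_{i=1}^{n} \|\varepsilon_i\|/n \Big) = O_P(h/n). \qquad (8.11)$$

An application of Lemma 8.2 completes the proof. □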


For the next proofs we note that we have uniform convergence towards partial sums
of expectations, in view of (4.6), and they converge uniformly via piecewise Lipschitz
continuity of $g$ towards an integral expression involving (8.2):

$$\sup_{0\le x\le 1} \|S_{\lfloor nx\rfloor}(\eta)/n^{1/2} - G(x)\Delta\| = O_P(n^{-1/2}) = o_P(1). \qquad (8.12)$$

Proof of Proposition 4.6. We consider the projection in the $k$-th direction, which yields for every time index $1\le j\le n$

$$\|\hat\Psi^{(d)} S_j(\eta)\| \ge |\langle S_j(\eta)/n^{1/2}, \hat v_k\rangle|\,|n/\hat\lambda_k|^{1/2}.$$

A combination of (8.4), (8.12) and of the assumptions on $\hat v_k$ and $\hat\lambda_k$ yields the assertion. □
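Proof of Proposition 4.7. Similar to the proof of Proposition 4.6 we may consider the
relation

$$\|\hat\Psi^{(d)} S_k(\eta)\| = (n/h)^{1/2}\,\|h^{1/2}\hat\Psi^{(d)} S_k(\eta)/n^{1/2}\| \ge (n/h)^{1/2}\,\|h^{1/2}\hat\Psi^{(1)} S_k(\eta)/n^{1/2}\|.$$

From Theorem 8.3 and Proposition 4 of Hörmann et al. (2015) we obtain that
$\|\hat{\mathcal{C}}_B/h - \kappa(\Delta\otimes\Delta)\|_S = o_P(1)$ with $\kappa = \mathcal{W}(g)\int_{-\infty}^{\infty} \mathcal{K}(x)\,dx$ and thus $\hat\lambda_1/h = \kappa + o_P(1)$,
$\hat\lambda_2/h = o_P(1)$ and also $\|\hat v_1 - s\Delta\| = o_P(1)$ with $s = \mathrm{sign}\langle\hat v_1,\Delta\rangle$. Hence, it is
$\|h^{1/2}\hat\Psi^{(1)} - \kappa^{-1/2}(\Delta\otimes\Delta)\|_S = o_P(1)$. Moreover, we know from (8.12) that

$$\max_{1\le k\le n} \|(\Delta\otimes\Delta) S_k(\eta)/n^{1/2}\| = \mathcal{G} + o_P(1)$$

with $\mathcal{G} > 0$ and the proof is complete. □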

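Proof of Corollary 5.1. The assertion follows immediately from Theorem 4.3 observing
that

$$|\hat{\mathcal{T}}^{(d)} - \hat{\mathcal{T}}^{(d)\prime}| \le \max_{1\le k\le n} |S_k(\hat{\boldsymbol{\varepsilon}}) - S_k(\hat{\boldsymbol{\varepsilon}}')|_{\hat\Sigma} \le \max_{1\le k\le n} \|S_k(\varepsilon)\|\,\|\hat v_1 - \hat v_1'\|/|\hat\lambda_1|^{1/2}$$
$$\le \max_{1\le k\le n} \|S_k(\varepsilon)\|\; \frac{\|\hat v_1\|\,\big|1 - \|\hat v_1 + n^{\gamma}su\|\big| + \|n^{\gamma}u\|}{\|\hat v_1 + n^{\gamma}su\|\,|\hat\lambda_1|^{1/2}} = o_P(1),$$

since $\|\hat v_1\| = 1$ and $n^{\gamma}u = O_P(n^{-1/2+\gamma}) = o_P(1)$ in view of (4.6) and $\gamma\in(0,1/2)$. We also used
that $\hat\lambda_1 = \lambda_1 + o_P(1)$ if $\|\hat{\mathcal{C}} - \mathcal{C}\|_S = o_P(1)$. □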


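Proof of Theorem 5.2. Relation (8.12) yields that

$$\|u\| = \max_{1\le k\le n} \|S_k(\eta)/n^{1/2}\| = \mathcal{G} + o_P(1)$$

and also that $\|\hat v_1/n^{\gamma} + su\| = \mathcal{G} + o_P(1)$ with $\mathcal{G} > 0$. Due to $\hat{\mathcal{T}}^{(d)\prime} \ge \hat{\mathcal{T}}^{(1)\prime}$ it is sufficient
to consider the behavior of the statistic in the first direction which corresponds to

$$\hat{\mathcal{T}}^{(1)\prime} = \max_{1\le k\le n} \frac{|\langle S_k(\eta)/n^{1/2}, \hat v_1/n^{\gamma} + su\rangle|}{\|\hat v_1/n^{\gamma} + su\|}\, |n/\hat\lambda_1|^{1/2}$$
$$\ge \Big| \max_{1\le k\le n} \frac{|\langle S_k(\eta)/n^{1/2}, \hat v_1/n^{\gamma}\rangle|}{\|\hat v_1/n^{\gamma} + su\|} - \max_{1\le k\le n} \frac{|\langle S_k(\eta)/n^{1/2}, u\rangle|}{\|\hat v_1/n^{\gamma} + su\|} \Big|\, |n/\hat\lambda_1|^{1/2}. \qquad (8.13)$$

Via (8.12) we observe, for one thing, that

$$\max_{1\le k\le n} \frac{|\langle S_k(\eta)/n^{1/2}, \hat v_1/n^{\gamma}\rangle|}{\|\hat v_1/n^{\gamma} + su\|} \le \max_{1\le k\le n} \frac{\|S_k(\eta)/n^{1/2}\|}{\|\hat v_1/n^{\gamma} + su\|\, n^{\gamma}} = O_P(n^{-\gamma}) \qquad (8.14)$$

and, for another thing, by evaluating at $k = \hat k$, that

$$\max_{1\le k\le n} \frac{|\langle S_k(\eta)/n^{1/2}, u\rangle|}{\|\hat v_1/n^{\gamma} + su\|} \ge \frac{\|u\|^2}{\|\hat v_1/n^{\gamma} + su\|} = \mathcal{G} + o_P(1). \qquad (8.15)$$

The claim follows by (8.4) and by recalling the assumption that $P(|n/\hat\lambda_1| > 1/c) =
P(|\hat\lambda_1/n| < c) \to 1$, for any $c>0$.

In contrast to the proof of Proposition 4.7 we did not explicitly use the asymptotic
behavior of $\hat v_1$. □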